CS484 - CS555 Introduction to Computer Vision
Dr. Sedat OZER
Submission Deadline: 23:59, November 1st, 2020 - After that time, late submissions will receive a 25% penalty per additional day!
TA e-mails:
Aziza: az.saber1995@gmail.com
Furkan: furkan.huseyin@bilkent.edu.tr
Submission link: (Submit your HW through this link:) https://forms.gle/njHwSn7WLSk5S7CW9
It is strongly advised that you first go through the links below and familiarize yourself with the Colab environment.
You should first load this HW file (HW1.ipynb) on Google Colab ( https://colab.research.google.com ), which acts as a remote server, and run it there, since we will also test your code on Colab. Then go through the links below to get a better understanding of the Colab environment.
Useful links (You should go through the links below to get introductory material for Colab and image processing with Python):
https://web.eecs.umich.edu/~justincj/teaching/eecs498/FA2020/colab.html
https://colab.research.google.com/github/cs231n/cs231n.github.io/blob/master/python-colab.ipynb
https://colab.research.google.com/drive/1b8pVMMoR37a3b9ICo8TMqMLVD-WvbzTk
There are many Colab tutorials available on Youtube. Here is a short video: https://www.youtube.com/watch?v=inN8seMm7UI and another video: https://www.youtube.com/watch?v=i-HnvsehuSw
OpenCV is a big and well-known library containing a large set of Computer Vision and Image Processing related functions. In this part of the assignment, you will learn to apply some basic OpenCV functions to images.
First, read Q1.png; then reduce its size by dividing its height and width by 3 and display the resized image (with figure name 'resized image'). You need to save your resized image as img01_1.jpg.
Next, convert your original input image (Q1.png) to grayscale, apply the same operations to the grayscale image (with figure name 'gray and resized image'), and display it.
For loading an image on Colab - you can check the following two sites to get ideas:
import cv2 as cv
import matplotlib.pylab as plt
from google.colab import drive
from google.colab.patches import cv2_imshow
import numpy as np
drive.mount('/content/drive', force_remount=True)
Mounted at /content/drive
image = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q1.png')
# PART 1
print('\nresized image') # cv2_imshow does not have an attribute for figure names.
resized_image = cv.resize(image, (int(image.shape[1]/3), int(image.shape[0]/3))) # dividing sizes by 3
cv2_imshow(resized_image)
cv.imwrite('img01_1.jpg',resized_image) # saving
#PART 2
print('\ngray and resized image')
gray_image = cv.cvtColor(image, cv.COLOR_BGR2GRAY) # converting to grayscale
gray_resized_image = cv.resize(gray_image, (int(gray_image.shape[1]/3), int(gray_image.shape[0]/3))) # dividing sizes by 3
cv2_imshow(gray_resized_image)
resized image
gray and resized image
In this part, you will compute and display the histogram of a grayscale image. A histogram is a graph that shows the total number of pixels for each intensity value of an image. It gives you an idea about the intensity distribution of a given image.
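As a quick sanity check of the definition, the counting can be tried on a toy array (`np.bincount` is used here only for illustration; the assignment code below implements the counting manually):

```python
import numpy as np

# A toy 3x3 "image" with intensities in 0..255
img = np.array([[0, 0, 1],
                [1, 1, 2],
                [2, 2, 2]], dtype=np.uint8)

# The histogram counts how many pixels take each intensity value
hist = np.bincount(img.ravel(), minlength=256)

print(hist[:3])                 # [2 3 4] -- two 0s, three 1s, four 2s
print(hist.sum() == img.size)   # True -- every pixel is counted exactly once
```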
Use the images Q1.png, Q2.png and Q2_2.png. Then compare and discuss your results in the text box given below the code box.
##put and run your code for Question 2 here
# first read the images: Q1.png, Q2.png and Q2_2.png
im02 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q1.png')
im03 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2.png')
im04 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2_2.png')
im02_gray = cv.cvtColor(im02, cv.COLOR_BGR2GRAY)
im03_gray = cv.cvtColor(im03, cv.COLOR_BGR2GRAY)
im04_gray = cv.cvtColor(im04, cv.COLOR_BGR2GRAY)
#calculate and plot the histogram of each image (with titles Q1, Q2, Q2_2 respectively), and display their histograms.
def comp_hist(img):
    hist = np.zeros(256)
    #### Computing Histogram ####
    # h = img.shape[0]
    # w = img.shape[1]
    # for i in range(h):
    #     for j in range(w):
    #         hist[img[i][j]] += 1
    #### Different Approach ####
    img = img.flatten()
    for i in img:
        hist[i] += 1
    return hist
hist02 = comp_hist(im02_gray)
hist03 = comp_hist(im03_gray)
hist04 = comp_hist(im04_gray)
# You can use the sample code below to COMPARE your own implementation of histogram
#____________________
plt.figure(figsize=([15, 5]))
plt.subplot(231), plt.hist(im02_gray.ravel(),bins = 256, range = [0,256]), plt.title('im02 Given Histogram'), plt.xlabel('Intensity'), plt.ylabel('Frequency')
plt.subplot(232), plt.hist(im03_gray.ravel(),bins = 256, range = [0,256]), plt.title('im03 Given Histogram'), plt.xlabel('Intensity')
plt.subplot(233), plt.hist(im04_gray.ravel(),bins = 256, range = [0,256]), plt.title('im04 Given Histogram'), plt.xlabel('Intensity')
#_____________________
# MY IMPLEMENTATION
plt.subplot(234), plt.title('im02 My Histogram'), plt.plot(hist02), plt.fill_between(range(256),hist02), plt.ylabel('Frequency'), plt.xlabel('Intensity')
plt.subplot(235), plt.title('im03 My Histogram'), plt.plot(hist03), plt.fill_between(range(256),hist03), plt.xlabel('Intensity')
plt.subplot(236), plt.title('im04 My Histogram'), plt.plot(hist04), plt.fill_between(range(256),hist04), plt.xlabel('Intensity')
plt.tight_layout() # plots were overlapping
plt.show()
I approached the histogram in both a 2D and a 1D way, which, as expected, gave the same result. The provided plots and my own plots are identical, except that mine look a bit smoothed. That is most likely because plt.hist draws bars while plt.plot draws a connected line; matplotlib is a library I am not very familiar with.
In this part, you will implement histogram equalization (for a grayscale image) to enhance the contrast in an image. Histogram equalization distributes the existing intensities on an image to a larger range so that all intensity values are used, as much as possible. Use images: Q2.png and Q2_2.png for this question.
Based on the results you obtain, discuss whether it is a good idea to use histogram equalization on all images in general.
You cannot use OpenCV's (or any other library's) built-in function to equalize histograms; you need to implement the histogram equalization part by yourself here. Also, you are not allowed to compute the histogram using built-in histogram functions either. You need to use your own implementation of histogram computing from the previous question.
Helpful sites:
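Before the full implementation, the textbook mapping s = round((L-1) * (cdf(r) - cdf_min) / (N - cdf_min)) can be sketched on a toy 3-bit image (illustrative only; the assignment code below applies the same idea with a masked array on 8-bit images):

```python
import numpy as np

# Toy image whose intensities cluster in a narrow band (2..4 out of 0..7)
img = np.array([[2, 2, 3],
                [3, 3, 4],
                [4, 4, 4]], dtype=np.uint8)
L = 8  # pretend 3-bit intensities for readability

hist = np.bincount(img.ravel(), minlength=L)
cdf = np.cumsum(hist)

# Stretch the CDF onto the full range [0, L-1]
cdf_min = cdf[cdf > 0].min()
mapping = np.round((cdf - cdf_min) / (img.size - cdf_min) * (L - 1))
mapping = mapping.clip(0, L - 1).astype(np.uint8)

eq = mapping[img]            # apply the mapping as a per-pixel lookup table
print(img.min(), img.max())  # 2 4
print(eq.min(), eq.max())    # 0 7 -- the image now spans the full scale
```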
# implement and run your code for Question 3a below
# display the original grayscale image and the result with titles "original grayscale image", "equalized image" respectively
# display them in subplots, in which the first subplot is original image,
# second subplot is the result after applyiny histogram equalization.
#_____________________________
im2 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2.png')
im3 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2_2.png')
im2 = cv.cvtColor(im2, cv.COLOR_BGR2GRAY)
im3 = cv.cvtColor(im3, cv.COLOR_BGR2GRAY)
def hist_eq(img):
    hist = comp_hist(img)
    cdf = np.cumsum(hist)  # NumPy cumulative summation
    cdf_m = np.ma.masked_equal(cdf, 0)  # masking out zeros
    cdf_m = (cdf_m - cdf_m.min()) * 255 / (cdf_m.max() - cdf_m.min())  # normalizing the CDF
    cdf_final = np.ma.filled(cdf_m, 0).astype('uint8')  # we can't use floating-point values in images
    img_final = cdf_final[img]  # using the mapping as a lookup table
    return img_final
im2_final = hist_eq(im2)
im3_final = hist_eq(im3)
plt.figure(figsize = [30,10])
plt.subplot(221), plt.imshow(im2, cmap='gray'),plt.title('Original Grayscale Image')
plt.subplot(222), plt.imshow(im3, cmap='gray'),plt.title('Original Grayscale Image')
plt.subplot(223), plt.imshow(im2_final, cmap='gray') ,plt.title('Equalized Image')
plt.subplot(224), plt.imshow(im3_final, cmap='gray'),plt.title('Equalized Image')
plt.tight_layout()
plt.show()
# Show both images and their histogram equalized versions below
From my understanding, histogram equalization essentially "normalizes" the image. It works very well when the input contains closely packed intensity values: the method then increases the global contrast of the image, which allows areas of lower local contrast to gain higher contrast. I think it is generally a good method to use for all images.
In this part, you will compute the histogram equalization again on the same images that you used in the previous part (Q2.png, Q2_2.png). The only difference is that, this time, you are allowed to use OpenCV's built-in histogram equalization function.
After you apply the histogram equalization on both images, compare your results with the results obtained in the previous question (Question 3a).
Use subplots for showing the four results together. The first subplot is the result image from part "a" for image Q2.png with title 'Q2 equalized'. The second subplot is the result from part "a" for image Q2_2.png with title 'Q2_2 equalized'. The third subplot is the result image from part "b" for image Q2.png with title 'Q2 equalized OpenCV'. The fourth subplot is the result image from part "b" for image Q2_2.png with title 'Q2_2 equalized OpenCV'.
Also, plot the histogram of Q2_2.png before applying histogram equalization and the histogram of resulting image after applying histogram equalization on the same image. Show both histograms side by side in a single Figure.
Helpful sites:
# put and run your code for Question 3b here
cv_eq2 = cv.equalizeHist(im2)
cv_eq3 = cv.equalizeHist(im3)
plt.figure(figsize = [30,10])
plt.subplot(221), plt.imshow(im2_final, cmap = 'gray') ,plt.title('Q2 equalized')
plt.subplot(222), plt.imshow(im3_final, cmap = 'gray'),plt.title('Q2_2 equalized')
plt.subplot(223), plt.imshow(cv_eq2, cmap='gray'),plt.title('Q2 equalized OpenCV')
plt.subplot(224), plt.imshow(cv_eq3, cmap='gray'),plt.title('Q2_2 equalized OpenCV')
plt.tight_layout()
plt.show()
# Only using my implementation because on Question 2 we can see it is accurate.
hist3_after = comp_hist(im3_final)
hist3_before = comp_hist(im3)
plt.figure(figsize = [15,5])
plt.subplot(121), plt.title('Q2_2 Before Histogram'), plt.plot(hist3_before), plt.fill_between(range(256),hist3_before), plt.ylabel('Frequency'), plt.xlabel('Intensity')
plt.subplot(122), plt.title('Q2_2 After Histogram'), plt.plot(hist3_after), plt.fill_between(range(256),hist3_after), plt.xlabel('Intensity')
plt.tight_layout()
plt.show()
binary_image = otsu_threshold(source_image)
Separate the background from the foreground using your implementation of Otsu's algorithm. Show the binary mask and the original input image side by side. Test your results on both the Q2.png and Q2_2.png files. Discuss your results: are they always perfect? Why, or why not?
Use image Q1.png: after applying Otsu's thresholding to the image, show only the clouds by removing the content of the tree from the image (i.e., do not show the actual pixels that belong to the tree). (All the pixels that belong to the tree can be artificially colored with a fixed color value.)
Useful links:
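For reference, Otsu's between-class-variance criterion can also be written in vectorized form. This is only a cross-check sketch (the function name `otsu_sketch` is mine), not the required loop-based implementation:

```python
import numpy as np

def otsu_sketch(gray):
    """Return the threshold maximizing the between-class variance.

    Vectorized sketch of Otsu's criterion for uint8 input, useful to
    cross-check a loop-based implementation.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()                # intensity probabilities
    omega = np.cumsum(p)                 # class-0 weight per candidate threshold
    mu = np.cumsum(p * np.arange(256))   # class-0 cumulative (weight * mean)
    mu_total = mu[-1]
    # between-class variance; 0/0 at empty classes is suppressed to 0
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

# Two well-separated intensity clusters -> the threshold falls between them
dark = np.full(100, 40, dtype=np.uint8)
bright = np.full(100, 200, dtype=np.uint8)
t = otsu_sketch(np.concatenate([dark, bright]))
print(40 <= t < 200)  # True
```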
# implement your function below.
def otsu_threshold(source_image):
    max_betw_var = -np.inf
    threshold = 0
    hist = comp_hist(source_image)
    hist = hist/hist.max() # Normalization, because the counts can be really big.
    histcs = np.cumsum(hist)
    for i in range(256):
        # Skip thresholds where either class is empty (avoids 0/0 warnings).
        if histcs[i] == 0 or histcs[255] - histcs[i] == 0:
            continue
        w_b = histcs[i]/histcs[255]
        m_b = 0
        for j in range(i):
            m_b += j*hist[j]
        m_b = m_b/histcs[i]
        w_f = (histcs[255]-histcs[i])/histcs[255]
        m_f = 0
        for j in range(1, 256-i):
            m_f += (i+j)*hist[i+j]
        m_f = m_f/(histcs[255]-histcs[i])
        between_var = w_b*w_f*(m_b - m_f)*(m_b - m_f)
        if between_var > max_betw_var:
            max_betw_var = between_var
            threshold = i
    threshold = threshold - 1 # I do position-wise calculations; this corrects them to index-wise.
    binary_image = np.zeros(source_image.shape)
    for h in range(source_image.shape[0]):
        for w in range(source_image.shape[1]):
            if source_image[h][w] > threshold:
                binary_image[h][w] = 255
    return binary_image, threshold
img1 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2.png')
img1 = cv.cvtColor(img1, cv.COLOR_BGR2GRAY)
img2 = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q2_2.png')
img2 = cv.cvtColor(img2, cv.COLOR_BGR2GRAY)
bin_img1, threshold1 = otsu_threshold(img1)
bin_img2, threshold2 = otsu_threshold(img2)
plt.figure(figsize = [21,7])
plt.subplot(221), plt.imshow(img1, cmap = 'gray') ,plt.title('Q2 Before Otsu')
plt.subplot(222), plt.imshow(bin_img1, cmap='gray'),plt.title('Q2 After Otsu')
plt.subplot(223), plt.imshow(img2, cmap = 'gray'),plt.title('Q2_2 Before Otsu')
plt.subplot(224), plt.imshow(bin_img2, cmap='gray'),plt.title('Q2_2 After Otsu')
plt.tight_layout()
plt.show()
Since the "background" of an image is usually sky, sea, concrete, etc., it has very similar pixel values across its surface. The foreground of an image, however, usually has pixel values that differ from this large uniform area. Hence, determining a threshold that captures this difference can separate the image into "background" and "foreground". Are the results always perfect? I don't think so, as there might not even be a large background surface in an image, such as a picture of a ball pit. Are they effective for most everyday pictures? I think yes, as seen in the examples above: this method can distinguish trees and the Sydney Opera House from the background.
# Here you can test your function on the given binary image and see the result
img = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/Q1.png')
# img = cv.resize(img, (int(img.shape[1]/4), int(img.shape[0]/4)))
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
binary_image, threshold = otsu_threshold(img_gray)
invert_bin_img = 255-binary_image
b, g, r = cv.split(img) # OpenCV stores channels in BGR order
# Add the inverted mask to blue and red and subtract it from green, then clip to [0, 255].
masked_b = np.clip(b + invert_bin_img, 0, 255)
masked_g = np.clip(g - invert_bin_img, 0, 255)
masked_r = np.clip(r + invert_bin_img, 0, 255)
masked_img = img.copy()
masked_img[:,:,0] = masked_b
masked_img[:,:,1] = masked_g
masked_img[:,:,2] = masked_r
print('\nImage before removing the content of tree')
cv2_imshow(img)
print('\nImage after removing the content of tree')
cv2_imshow(masked_img)
print('(all the contents of the tree are colored magenta(red+blue). Showing only the clouds)')
Image before removing the content of tree
Image after removing the content of tree
(all the contents of the tree are colored magenta(red+blue). Showing only the clouds)
# # See the OpenCV sample below that computes and compares two different thresholding techniques on the same input image.
# # This code example uses OpenCV's built-in functions. HOWEVER, you need to implement OTSU's technique by yourself.
# # The first thresholding below is the standard technique where we enter the threshold value manually,
# # the second thresholding below is OTSU's technique.
# # Change the image path according to your own Google drive structure!
# # global thresholding
# ret1,th1 = cv.threshold(img,10,255,cv.THRESH_BINARY)
# # Otsu's thresholding
# ret2,th2 = cv.threshold(img,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
# # plot all the images and their histograms
# images = [img, 0, th1,
# img, 0, th2]
# titles = ['Original Image','Histogram','Manual Thresholding',
#           'Original Image','Histogram',"Otsu's Thresholding"]
# plt.imshow(th1,'gray')
# for i in [0,1]:
# plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
# plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
# plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
# plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
# plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
# plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
# plt.show()
# #Now lets show the result obtained with Otsu's technique in a larger frame.
# height, width = th2.shape[:2]
# th2 = cv.resize(th2, (int(width/4), int(height/4)))
# cv2_imshow(th2)
In this section, you will first implement Dilation operator as a function. In your function, you will take two inputs (as matrices): (i) the input image as a matrix and (ii) a structuring element (also as a matrix), and produce a binary image (another matrix) as output. Note that the resulting image matrix should be of the same size as input image matrix. As before, in your function, your methods should be generic, that is it should work with grayscale image and structuring element of any size. Your method declaration should look as follows:
dilated_image = dilation (source_image , structuring_el )
You should generate the structuring element (kernel) as a binary image with an arbitrary shape (you can use 3 by 3 square matrix with all 1s for testing).
Given the structuring element, your code should implement the dilation operation by using the definitions given in the course slides. Note that the structuring element should be created (as a matrix) outside and given as an input to the dilation/erosion codes so that your code can work with any kind of structuring element. You can assume that the origin is in the center, or get the coordinate of the origin as a separate input to your functions.
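To make the definition concrete, here is a minimal sketch of dilation (assuming a structuring element with its origin at the center and zero padding at the borders; `dilate_sketch` is a name of my own, separate from the required `dilation` function below):

```python
import numpy as np

def dilate_sketch(img, se):
    """Dilation by definition: an output pixel is on if the structuring
    element placed at that pixel hits any on-pixel of the input.
    Sketch for a symmetric SE with its origin at the center."""
    h, w = img.shape
    kh, kw = se.shape
    cy, cx = kh // 2, kw // 2
    padded = np.pad(img, ((cy, cy), (cx, cx)))   # zero border avoids index errors
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + kh, x:x + kw]
            if np.any(window[se == 1]):          # does the SE hit the foreground?
                out[y, x] = 255
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 255                                  # single on-pixel
se = np.ones((3, 3), dtype=np.uint8)
out = dilate_sketch(img, se)
print(out)  # the single on-pixel grows into a 3x3 block
```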
## implement your python function for dilation here
#
def dilation(source_image, structuring_el):
    hi, wi = source_image.shape
    hs, ws = structuring_el.shape
    # Assuming the origin is at the center.
    hsc = int(hs/2)
    wsc = int(ws/2)
    temp_mat = np.zeros((hi+(2*hsc),wi+(2*wsc))) # Adding temporary borders to avoid index errors.
    temp_mat[hsc:hsc+hi,wsc:wsc+wi] = source_image # Copying the source into the center of temp.
    for i in range(hsc, hsc+hi):
        for j in range(wsc, wsc+wi):
            if source_image[i-hsc,j-wsc] > 0:
                # The if-checks handle even/odd kernel sizes.
                if hs%2==1 and ws%2==1:
                    temp_mat[i-hsc:i+hsc+1,j-wsc:j+wsc+1] += structuring_el # Basically stamping the kernel onto the image.
                elif hs%2==1 and ws%2==0:
                    temp_mat[i-hsc:i+hsc+1,j-wsc:j+wsc] += structuring_el
                elif hs%2==0 and ws%2==1:
                    temp_mat[i-hsc:i+hsc,j-wsc:j+wsc+1] += structuring_el
                else:
                    temp_mat[i-hsc:i+hsc,j-wsc:j+wsc] += structuring_el
    # Thresholding the accumulated kernel overlaps back to a binary image.
    temp_mat[temp_mat > 0] = 255
    # Cropping the enlarged image back to the original size.
    dilated_image = temp_mat[hsc:hsc+hi,wsc:wsc+wi]
    return dilated_image
# test your dilation function here on 3b.png with a 3x3 structuring element (matrix) with all 1s.
#
img = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/3b.png')
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
binary_image, threshold = otsu_threshold(img_gray)
ste = np.ones((3,3),np.uint8)
print('\nMy Dilation implementation')
cv2_imshow(dilation(binary_image,ste))
print('\nCV2 Built in Dilation')
cv2_imshow(cv.dilate(binary_image,ste))
print('\nChecking if they are equal for each pixel:')
print((dilation(binary_image,ste)==cv.dilate(binary_image,ste)).all())
My Dilation implementation
CV2 Built in Dilation
Checking if they are equal for each pixel: True
Similar to the previous question, you will also implement a morphological operator here. This time, you will implement Erosion operator as a function. In your function, you will take two inputs (as matrices): (i) the input image as a matrix and (ii) a structuring element (also as a matrix), and produce a binary image (another matrix) as output. Note that the resulting image matrix should be of the same size as input image matrix. As before, in your function, your methods should be generic, that is it should work with grayscale image and structuring element of any size. Your method declaration should look as follows:
eroded_image = erosion ( source_image , structuring_el )
You should generate the structuring element (kernel) as a binary image with an arbitrary shape (you can use 3 by 3 square matrix with all entries being 1 for testing).
Given the structuring element, your code should implement the erosion operation by using the definitions given in the course slides. Note that the structuring element should be created (as a matrix) outside and given as an input to the dilation/erosion codes so that your code can work with any kind of structuring element. You can assume that the origin is in the center, or get the coordinate of the origin as a separate input to your functions.
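The definition can again be sketched minimally (same assumptions as before: SE origin at the center, zero padding at the borders; `erode_sketch` is my own name, separate from the required `erosion` function below):

```python
import numpy as np

def erode_sketch(img, se):
    """Erosion by definition: an output pixel is on only if the structuring
    element placed there fits entirely inside the foreground.
    Sketch for an SE with its origin at the center."""
    h, w = img.shape
    kh, kw = se.shape
    cy, cx = kh // 2, kw // 2
    padded = np.pad(img, ((cy, cy), (cx, cx)))   # zero border, so image edges erode away
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + kh, x:x + kw]
            if np.all(window[se == 1]):          # is the SE fully contained?
                out[y, x] = 255
    return out

img = np.zeros((5, 5), dtype=np.uint8)
img[1:4, 1:4] = 255                              # a 3x3 foreground square
se = np.ones((3, 3), dtype=np.uint8)
out = erode_sketch(img, se)
print(out[2, 2], out[1, 1])                      # 255 0 -- only the center survives
```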
# implement your erosion operator here
#
def erosion(source_image, structuring_el):
    temp_mat = np.zeros(source_image.shape)
    hi, wi = source_image.shape
    hs, ws = structuring_el.shape
    hsc = int(hs/2)
    wsc = int(ws/2)
    row_loc, col_loc = np.where(structuring_el == 1) # Locating the 1-pixels in the structuring element.
    temp_mat_large = np.zeros((hi+(2*hsc),wi+(2*wsc))) # Adding temporary borders to avoid index errors.
    temp_mat_large[hsc:hsc+hi,wsc:wsc+wi] = source_image # Copying the source into the center of temp.
    for i in range(hsc,hi+hsc):
        for j in range(wsc,wi+wsc):
            streak = False # All SE pixels must be contained, so track a streak that breaks on the first miss.
            for k in range(row_loc.size):
                if temp_mat_large[row_loc[k]-hsc+i,col_loc[k]-wsc+j] > 0: # Checking the SE locations against the source image.
                    streak = True
                else:
                    streak = False
                if not streak: # Should not continue if even one pixel is not contained.
                    break
            if streak: # If the streak survived the whole loop, the output pixel is set.
                temp_mat[i-hsc,j-wsc] = 255
    return temp_mat
# test your erosion function here on a binary image with a 3x3 structuring element (matrix) with all 1s.
#
img = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/3b.png')
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
binary_image, threshold = otsu_threshold(img_gray)
ste = np.ones((3,3),np.uint8)
print('\nMy Erosion implementation')
cv2_imshow(erosion(binary_image,ste))
print('\nCV2 Built in Erosion')
cv2_imshow(cv.erode(binary_image,ste))
print('\nChecking if they are equal for each pixel:')
print((erosion(binary_image,ste)==cv.erode(binary_image,ste)).all())
print('\nChecking the conflicting pixels:')
print(np.where((erosion(binary_image,ste)==cv.erode(binary_image,ste))==False))
print('\n NOTE: From the lectures and the lecture slides, I thought that because of the nature of the erosion operation, the very edges of the resulting image should always be 0.\n But the OpenCV erode operation keeps 1-pixel values at the very edges (by default it treats the area outside the image as foreground), hence I did not get 100% agreement.')
My Erosion implementation
CV2 Built in Erosion
Checking if they are equal for each pixel:
False
Checking the conflicting pixels:
(two long arrays omitted: the row and column indices of the mismatching pixels)
NOTE: From the lectures and the lecture slides, I thought that because of the nature of the erosion operation, the very edges of the resulting image should always be 0.
But the OpenCV erode operation keeps 1-pixel values at the very edges (by default it treats the area outside the image as foreground), hence I did not get 100% agreement.
In this question, you will use your previously implemented functions. You are provided with two grayscale images, 3a.png and 3b.png, for this question. You are expected to apply a sequence of thresholding, region growing and/or morphological operations to segment and count the distinct objects in each given image individually.
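Counting the segmented objects boils down to labeling connected components. Here is a minimal 4-connectivity flood-fill sketch of that step (the solution code below uses cv.connectedComponents instead):

```python
import numpy as np
from collections import deque

def count_components(binary):
    """Count 4-connected foreground components in a binary image via BFS flood fill."""
    visited = np.zeros(binary.shape, dtype=bool)
    h, w = binary.shape
    count = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not visited[sy, sx]:
                count += 1                      # a new object was found
                queue = deque([(sy, sx)])
                visited[sy, sx] = True
                while queue:                    # BFS over the whole component
                    y, x = queue.popleft()
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not visited[ny, nx]:
                            visited[ny, nx] = True
                            queue.append((ny, nx))
    return count

blobs = np.array([[1, 1, 0, 0],
                  [0, 0, 0, 1],
                  [1, 0, 0, 1]], dtype=np.uint8)
print(count_components(blobs))  # 3
```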
## Implement and run your code here
img3a = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/3a.png')
img3a = cv.cvtColor(img3a, cv.COLOR_BGR2GRAY)
img3b = cv.imread('/content/drive/My Drive/CS 484 Computer Vision/HW1/images/3b.png')
img3b = cv.cvtColor(img3b, cv.COLOR_BGR2GRAY)
### FIRST STEP
bin3a, thr3a = otsu_threshold(img3a)
bin3b, thr3b = otsu_threshold(img3b)
print('\nBest threshold for image 3a is: {}, for image 3b is: {}'.format(thr3a,thr3b))
Best threshold for image 3a is: 108, for image 3b is: 146
### SECOND STEP PART 1
okern7 = np.array([[0, 0, 0, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 0],
[0, 0, 0, 1, 0, 0, 0]], np.uint8)
tikern7 = np.array([[0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0]], np.uint8)
ikern17 = np.identity(17)
skern11 = np.ones((11,11))
skern21 = np.ones((21,21))
print('\nOriginal Image')
cv2_imshow(bin3b)
print('\nMorphological Steps:')
plt.figure(figsize = [21,35])
img = erosion(bin3b, skern11)
plt.subplot(521), plt.imshow(img, cmap = 'gray'), plt.title('Erosion(k=skern11)')
img = dilation(img,ikern17)
plt.subplot(522), plt.imshow(img, cmap='gray'), plt.title('Dilation(k=ikern17)')
img = erosion(img,skern11)
plt.subplot(523), plt.imshow(img, cmap='gray'), plt.title('Erosion(k=skern11)')
img = dilation(img,skern11)
plt.subplot(524), plt.imshow(img, cmap='gray'), plt.title('Dilation(k=skern11)')
img = erosion(img,skern11)
plt.subplot(525), plt.imshow(img, cmap='gray'), plt.title('Erosion(k=skern11)')
img = dilation(img,skern11)
plt.subplot(526), plt.imshow(img, cmap='gray'), plt.title('Dilation(k=skern11)')
img = erosion(img,skern21)
plt.subplot(527), plt.imshow(img, cmap='gray'), plt.title('Erosion(k=skern21)')
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
plt.subplot(528), plt.imshow(img, cmap='gray'), plt.title('5*Dilation(k=okern7)')
img = dilation(img,tikern7)
img = dilation(img,tikern7)
img = dilation(img,tikern7)
plt.subplot(529), plt.imshow(img, cmap='gray'), plt.title('3*Dilation(k=tikern7)')
img = dilation(img,ikern17)
plt.subplot(5,2,10), plt.imshow(img, cmap='gray'), plt.title('Dilation(k=ikern17)')
plt.tight_layout()
plt.show()
print('\nResulting Image')
cv2_imshow(img)
Original Image
Morphological Steps:
Resulting Image
img = img.astype('uint8')
num_labels, labels = cv.connectedComponents(img) # Using CV2 connected components
print(f'\nNumber of Planes = {num_labels - 1}\n') # Excluding the background label
# Code from online, which paints the labels in different colors.
label_hue = np.uint8(255*labels/np.max(labels))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv.cvtColor(labeled_img, cv.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0
cv2_imshow(labeled_img)
Number of Planes = 34
### SECOND STEP PART 2
print('\nOriginal Image')
cv2_imshow(bin3a)
# Inverting because we want the car objects, which are black.
img = 255-bin3a
print('\nMorphological Steps:')
plt.figure(figsize = [21,7])
img = erosion(img, skern21)
plt.subplot(121), plt.imshow(img, cmap = 'gray'), plt.title('Erosion(k=skern21)')
img = dilation(img,skern11)
plt.subplot(122), plt.imshow(img, cmap='gray'), plt.title('Dilation(k=skern11)')
plt.tight_layout()
plt.show()
print('\nResulting Image')
cv2_imshow(img)
[Cell output: the original image, the morphological step figure, and the resulting image are displayed here.]
img = img.astype('uint8')
num_labels, labels = cv.connectedComponents(img) # Using CV2 connected components
print(f'\nNumber of Cars = {num_labels - 1}\n') # Excluding the background label
# Adapted from an online snippet: colors each label with a distinct hue.
label_hue = np.uint8(255*labels/np.max(labels))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv.cvtColor(labeled_img, cv.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0
cv2_imshow(labeled_img)
Number of Cars = 16
My approach was to first remove the noise from the image, but doing so also erased many plane pixels. To compensate, I applied a dilation after each erosion to minimize the plane-pixel loss. I also realized that when the kernel is a diagonal matrix, such as an identity matrix, the dilation stretches pixels along the \ direction, which helped me reconnect separated parts of the same plane.
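The kernels named in the steps below (skern11, skern21, ikern17, tikern7) are defined earlier in the notebook; as a sketch, they could be built with NumPy along these lines. The sizes and names come from the code above, but these exact constructions are my assumption:

```python
import numpy as np

# Hypothetical reconstructions of the kernels used below; the real
# definitions live earlier in the notebook.
skern11 = np.ones((11, 11), np.uint8)            # solid 11x11 square
skern21 = np.ones((21, 21), np.uint8)            # solid 21x21 square
ikern17 = np.eye(17, dtype=np.uint8)             # identity: 1s along the "\" diagonal
tikern7 = np.fliplr(np.eye(7, dtype=np.uint8))   # reverse identity: 1s along "/"

# Dilation with a diagonal kernel smears foreground pixels along that
# diagonal, which is why it can bridge gaps between parts of one plane.
print(int(ikern17.sum()), int(tikern7.sum()))
```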
First, an erosion with a large square kernel wipes out the small noise:
img = erosion(bin3b, skern11)
The resulting image has many planes with gaps along the / direction, so I applied a \ dilation (identity kernel) to stretch the pixels toward each other:
img = dilation(img,ikern17)
Another large-kernel erosion to wipe out remaining small noise:
img = erosion(img,skern11)
This also removed many plane pixels, so a dilation to regain them:
img = dilation(img,skern11)
One more large-kernel erosion against the remaining noise:
img = erosion(img,skern11)
Again dilating to recover the lost plane pixels:
img = dilation(img,skern11)
At this point the small noise is gone, but my dilations have grown some larger noise blobs, so I applied an erosion with an even larger kernel:
img = erosion(img,skern21)
Now all the unwanted pixels are gone and only plane pixels remain, but they are scattered and cover a small area. I applied five dilations with a small circular kernel to grow them precisely and at high resolution (creating a 35x35 circle matrix by hand would have been very hard, hence the repeated small disk):
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
img = dilation(img,okern7)
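The disk kernel okern7 mentioned above does not have to be typed by hand. One way to generate it, sketched here under the assumption that okern7 is a 7x7 binary disk (its real definition is elsewhere in the notebook), is a simple distance test; cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7)) produces a similar shape:

```python
import numpy as np

def circle_kernel(size):
    """Binary disk of the given (odd) size, e.g. a 7x7 stand-in for okern7.
    Scales to any size, so a 35x35 circle would be just as easy."""
    r = (size - 1) / 2
    y, x = np.ogrid[:size, :size]
    return ((x - r) ** 2 + (y - r) ** 2 <= r ** 2).astype(np.uint8)

okern7 = circle_kernel(7)
print(okern7)  # 1s form a filled circle; the corners stay 0
```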
The resulting image still has five planes with gaps along the \ direction, so I applied three dilations with a small reverse-identity (/) kernel to connect the separated parts:
img = dilation(img,tikern7)
img = dilation(img,tikern7)
img = dilation(img,tikern7)
Now only one plane with a gap remains. To close it I applied a dilation with a large identity kernel; this also enlarged the slim objects:
img = dilation(img,ikern17)
As a result, cv2.connectedComponents() found 35 labels, one of which is the background, so the image contains 34 distinct objects, matching the 34 planes actually in the image. The detected plane locations also line up precisely with the original image.
In this part the noise was small and scattered while the cars were large. The image first needed to be inverted, because the pixels of the wanted objects were 0 and the noise was 255:
img = 255-bin3a
Applying an erosion with a very large kernel now wipes out the noise, which is small, without fully removing the car pixels, which are much larger:
img = erosion(img, skern21)
With all the noise removed by this very large erosion, the remaining car pixels just needed to be dilated back:
img = dilation(img,skern11)
As a result, cv2.connectedComponents() found 17 labels, one of which is the background, so the image contains 16 distinct objects, matching the 16 cars actually in the image. The detected car locations also line up precisely with the original image.
|    | a | b | c | d |
|----|---|---|---|---|
| Q1 |   |   |   |   |
| Q2 |   |   |   |   |
| Q3 |   |   |   |   |
| Q4 |   |   |   |   |
| Q5 |   |   |   |   |
| Q6 |   |   |   |   |

Total: /100